Summary

The dashed green line marks the date of QI intervention.

Row

Demographic Parity

We measure demographic parity using the proportion of each demographic and age subpopulation that was given drug testing.
Goal: Achieve parity in all race and age groups

Row

Visual

Row

Predictive Parity

We measure predictive parity as the proportion of each demographic’s subpopulation that tested positive for THC or Non-THC drugs.
Goal: Achieve parity in all race groups and drug detection types

Row

Visual

Row

Equalized Odds

We measure equalized odds using the proportion of each demographic and age subpopulation that was given drug testing with a relevant order indication such as “Substance use during pregnancy, excluding marijuana” and “History of opioids prescribed during pregnancy”.
Goal: Achieve parity in all race and age groups

Row

Visual

Row

General Group Equity

We measure general group equity using the proportion of each demographic and age subpopulation that received intervention for the correct event.
Equation: (TP+FP)/(TP+FN)
Definition:
- True positive is # of patients tested positive for non THC substance and was reported to CPS
- False positive is # of patients tested negative for non THC substance and was reported to CPS
- True negative is # of patients tested negative for non THC substance and wasn’t reported to CPS
- False negative is # of patients tested positive for non THC substance and was reported to CPS
Goal: Achieve as close to a ratio of 1 as possible so that a group is not under-served (ratio < 1) or over-served (ratio > 1)

Row

Visual

Row

Equal Outcomes for non tested mothers

We measure equal outcomes for non tested mothers using the proportion of each demographic subpopulation that were reported to CPS without an UDS indicating non-THC drug use. There may be other evidence not available in this dataset that would be relevant to such a report, so this visual serves to illustrate the relationship between non-testing and CPS report only.
Goal: Achieve parity in all race groups

Row

Visual

Row

Equal Outcomes for tested mothers

We measure equal outcomes for non tested mothers using the proportion of each demographic subpopulation that were reported to CPS with an UDS indicating non-THC drug use. There may be other evidence not available in this dataset that would be relevant to such a report, so this visual serves to illustrate the relationship between testing and CPS report only.
Goal: Achieve parity in all race groups

Row

Visual

Statistics

Row

Demographic Parity

We measure demographic parity using the proportion of each demographic and age subpopulation that was given drug testing. If no intervention date was given, then the same table will show for both pre and post intervention sections.
Goal: Achieve parity in all race and age groups

Row

Logistic regression model - Pre Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -1.9 -2.5, -1.3 <0.001
    White -0.82 -1.0, -0.61 <0.001
Age group


    Under 25 0.00
    25 - 30 -0.19 -0.43, 0.04 0.11
    30 - 34 -0.59 -0.87, -0.31 <0.001
    Over 34 -0.60 -0.88, -0.32 <0.001
1 OR = Odds Ratio, CI = Confidence Interval

Logistic regression model - Post Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -1.3 -2.2, -0.51 0.003
    White -0.48 -0.81, -0.15 0.004
Age group


    Under 25 0.00
    25 - 30 -0.62 -1.1, -0.18 0.006
    30 - 34 -0.08 -0.49, 0.33 0.7
    Over 34 -0.43 -0.90, 0.01 0.061
1 OR = Odds Ratio, CI = Confidence Interval

Row

Predictive Parity

We measure predictive parity as the proportion of each demographic’s subpopulation that tested positive for THC or Non-THC drugs. If no intervention date was given, then the same table will show for both pre and post intervention sections.
Goal: Achieve parity in all race groups and drug detection types

Row

Non-THC Logistic regression model - Pre Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -1.6 -4.5, 0.05 0.12
    White 0.64 0.25, 1.0 0.001
Age group


    Under 25 0.00
    25 - 30 0.64 0.17, 1.1 0.007
    30 - 34 0.87 0.33, 1.4 0.002
    Over 34 0.86 0.31, 1.4 0.002
1 OR = Odds Ratio, CI = Confidence Interval

Non-THC Logistic regression model - Post Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -1.6 -4.5, 0.05 0.12
    White 0.64 0.25, 1.0 0.001
Age group


    Under 25 0.00
    25 - 30 0.64 0.17, 1.1 0.007
    30 - 34 0.87 0.33, 1.4 0.002
    Over 34 0.86 0.31, 1.4 0.002
1 OR = Odds Ratio, CI = Confidence Interval

THC Logistic regression model - Pre Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -0.74 -2.3, 0.50 0.3
    White -0.99 -1.4, -0.55 <0.001
Age group


    Under 25 0.00
    25 - 30 0.19 -0.25, 0.64 0.4
    30 - 34 -0.16 -0.71, 0.38 0.6
    Over 34 -0.40 -0.99, 0.17 0.2
1 OR = Odds Ratio, CI = Confidence Interval

THC Logistic regression model - Post Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -0.74 -2.3, 0.50 0.3
    White -0.99 -1.4, -0.55 <0.001
Age group


    Under 25 0.00
    25 - 30 0.19 -0.25, 0.64 0.4
    30 - 34 -0.16 -0.71, 0.38 0.6
    Over 34 -0.40 -0.99, 0.17 0.2
1 OR = Odds Ratio, CI = Confidence Interval

Row

Equal Outcomes for non tested mothers

We measure equal outcomes for non tested mothers using the proportion of each demographic subpopulation that were reported to CPS without an UDS indicating non-THC drug use. There may be other evidence not available in this dataset that would be relevant to such a report, so this visual serves to illustrate the relationship between non-testing and CPS report only. If no intervention date was given, then the same table will show for both pre and post intervention sections.
Goal: Achieve parity in all race groups

Row

Non tested mothers’ Logistic regression model - Pre Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -1.4 -2.8, -0.37 0.021
    White -1.3 -1.9, -0.72 <0.001
Age group


    Under 25 0.00
    25 - 30 -0.07 -0.67, 0.53 0.8
    30 - 34 -0.28 -0.98, 0.38 0.4
    Over 34 -0.52 -1.3, 0.19 0.2
1 OR = Odds Ratio, CI = Confidence Interval

Non tested mothers’ Logistic regression model - Post Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -1.4 -2.8, -0.37 0.021
    White -1.3 -1.9, -0.72 <0.001
Age group


    Under 25 0.00
    25 - 30 -0.07 -0.67, 0.53 0.8
    30 - 34 -0.28 -0.98, 0.38 0.4
    Over 34 -0.52 -1.3, 0.19 0.2
1 OR = Odds Ratio, CI = Confidence Interval

Tested mothers’ Logistic regression model - Pre Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -0.03 -1.4, 1.2 >0.9
    White 0.16 -0.23, 0.56 0.4
Age group


    Under 25 0.00
    25 - 30 0.29 -0.16, 0.74 0.2
    30 - 34 0.38 -0.15, 0.91 0.2
    Over 34 0.39 -0.15, 0.92 0.2
1 OR = Odds Ratio, CI = Confidence Interval

Tested mothers’ Logistic regression model - Post Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -0.03 -1.4, 1.2 >0.9
    White 0.16 -0.23, 0.56 0.4
Age group


    Under 25 0.00
    25 - 30 0.29 -0.16, 0.74 0.2
    30 - 34 0.38 -0.15, 0.91 0.2
    Over 34 0.39 -0.15, 0.92 0.2
1 OR = Odds Ratio, CI = Confidence Interval
---
title: "Fairlabs Dashboard"
header-includes:
  - \usepackage{comment}
output:
  flexdashboard::flex_dashboard:
    theme: cosmo
    css: style.css
    orientation: rows
    vertical_layout: scroll
    source_code: embed
---
```{r, include=FALSE}
library(flexdashboard)
library(tidyverse)
# library(lubridate)
library(plotly)
library(knitr)
library(janitor)
library(DT)
library(reshape2)
require(scales)
library(ggplot2)
library(gtsummary)
library(htmltools)
```

```{r}
#TODO:

# data definitions to external file? json or yaml well it's tabs now
# function for transforming/cleaning data based on definitions done
# function for calculations of summaries and stats based on grouping variables? I'm not going to do that

```

```{r}
# cleans raw data based on user defined input columns

clean_data <- function(df, input){

  # Pull user defined columns

  # dates
  delivery_date_in <- input[input$column_out=='delivery_date','column_in']
  cps_reporting_date_in <- input[input$column_out=='cps_reporting_date','column_in']
  uds_collection_date_in <- input[input$column_out=='uds_collection_date','column_in']
  uds_test_in <- input[input$column_out=='uds_test','column_in']
  maternal_birth_date_in <- input[input$column_out=='maternal_birth_date','column_in']
  # static QI intervention date from dictionary 
  intervention_date_in <- as.Date(input[input$column_out=='intervention_date','column_in'])

  # filled if non-THC and THC are stored as individual drug columns
  non_thc_col <- strsplit(input[input$column_out=='non_thc_cols','column_in'], split ='\\\\t')[[1]]
  thc_col_in <- input[input$column_out=='thc_col','column_in']
  # filled if non-THC and THC are stored as true/false 
  non_thc_detect_in <- input[input$column_out=='non_thc_detect','column_in']
  thc_detect_in <- input[input$column_out=='thc_detect','column_in']

  maternal_age_in <- input[input$column_out=='maternal_age','column_in']
  maternal_race_in <- input[input$column_out=='maternal_race','column_in']

  order_indication_in <- input[input$column_out=='order_indication','column_in']
  # Used to determine which order indication is for non-THC drug use
  ord_indict_non_thc_in <- strsplit(input[input$column_out=='ord_indict_non_thc','column_in'], split ='\\\\t')[[1]]

  # Checks if each column exist, and format for summary and visual uses
  if (delivery_date_in == "") {
    df$delivery_date <- NA
    df$delivery_month <- NA
  } else {
    df$delivery_date <- as.Date(df[[delivery_date_in]])
    df$delivery_month <- as.Date(format.Date(df$delivery_date, '%Y-%m-01'))
  }

  if (cps_reporting_date_in == "") {
    df$cps_reporting_date <- NA
    df$cps_report <- NA
  } else {
    df$cps_reporting_date <- as.Date(df[[cps_reporting_date_in]])
    df$cps_report <- if_else(is.na(df$cps_reporting_date), FALSE, TRUE)
  }

  if (is.na(intervention_date_in)) {
    df$pre_post_QI <- NA
  } else {
    df <- df %>%
      mutate(pre_post_QI = factor(if_else(delivery_date >= intervention_date_in, 'Post', 'Pre'), 
                                  levels = c('Pre', 'Post')))
  }

  if ((uds_collection_date_in == "" & uds_test_in != "")) {
    df$uds_collection_date <- NA
    df$uds_test <- df[[uds_test_in]]
  } else if (uds_collection_date_in != "") {
    df$uds_collection_date <- as.Date(df[[uds_collection_date_in]])
    df$uds_test <- if_else(is.na(df$uds_collection_date), FALSE, TRUE)
  } else {
    df$uds_collection_date <- NA
    df$uds_test <- NA
  }

  if (thc_detect_in != "" & thc_col_in == "") {
    df$thc_detect <- df[[thc_detect_in]]
  } else if (thc_col_in != "") {
    df$detected_tetrahydrocannabinol <- df[[thc_col_in]]
    df$thc_detect <- if_else(df$detected_tetrahydrocannabinol==1, TRUE, FALSE)
  } else {
    df$thc_detect <- NA
  }

  if (non_thc_detect_in != "" & is_empty(non_thc_col)) {
    df$non_thc_detect <- df[[non_thc_detect_in]]
  } else if (!is_empty(non_thc_col)) {
    df <- df %>%
      mutate(non_thc_detect = if_else(rowSums(df[non_thc_col]) > 0, TRUE, FALSE))
  } else {
    df$non_thc_detect <- NA
  }

  if (maternal_age_in != "") {
    df$maternal_age <- df[[maternal_age_in]]
    df <- df %>%
      mutate(age_group = factor(case_when(maternal_age < 25 ~ 'Under 25', 
                                maternal_age < 30 ~ '25 - 30', 
                                maternal_age < 34 ~ '30 - 34', 
                                TRUE ~ 'Over 34'), levels=c('Under 25','25 - 30','30 - 34','Over 34')))
  } else if (maternal_birth_date_in != "") {
    df$maternal_birth_date = as.Date(df[[maternal_birth_date_in]])
    df$maternal_age <- floor(as.numeric(difftime(delivery_date, maternal_birth_date, units = 'days'))/365.25)
    df <- df %>%
    mutate(age_group = factor(case_when(maternal_age < 25 ~ 'Under 25', 
                              maternal_age < 30 ~ '25 - 30', 
                              maternal_age < 34 ~ '30 - 34', 
                              TRUE ~ 'Over 34'), levels=c('Under 25','25 - 30','30 - 34','Over 34')))
  } else {
    df$maternal_age <- NA
    df$age_group <- NA
  }

  if (maternal_race_in != "") {
    df$maternal_race <- df[[maternal_race_in]]
    df <- df %>%
      mutate(race_group = case_when(maternal_race == 'Black or African American' ~ maternal_race, 
                                   maternal_race == 'White' ~ maternal_race, 
                                   TRUE ~ 'Other'))
  } else {
    df$maternal_race <- NA
    df$race_group <- NA
  }

  if (order_indication_in == "") {
    df$ord_indict_non_thc <- NA
  } else {
    df$order_indication <- df[[order_indication_in]]
    df$ord_indict_non_thc <- if_else(df$order_indication %in% ord_indict_non_thc_in, TRUE, FALSE)
  }

  df
}

```

```{r}
data <- read.csv("/mnt/home/jzhang92/fairlabs/fairlabs_data.csv")
# data <- read.csv("/mnt/home/jzhang92/fairlabs/fairlabs_local_data_j.csv")
input <- read.csv("/mnt/home/jzhang92/fairlabs/fairlab_input_dict.txt", sep = '\t')
# input <- read.csv("/mnt/home/jzhang92/fairlabs/fairlab_input_dict_l.txt", sep = '\t')

race_pal <- setNames(object = c("#66c2a5","#fc8d62","#8da0cb"), nm = c('Black or African American', 'White', 'Other'))
age_pal <- setNames(object = c("#f0f9e8","#bae4bc","#7bccc4", '#2b8cbe'), nm = c('Under 25', '25 - 30', '30 - 34', 'Over 34'))
intervention_date <- as.Date(input[input$column_out=='intervention_date','column_in'])

data <- clean_data(data, input)
```


```{r}
# data summary 

## demographic parity
demo_sum <- data %>%
  group_by(race_group,age_group,pre_post_QI,delivery_month,uds_test) %>%
  count() %>%
  ungroup() %>%
  group_by(race_group,age_group,pre_post_QI,delivery_month) %>%
  mutate(total = sum(n),
         perc_total = round(n/total, digits = 2)) %>%
  ungroup() 

## predictive parity
pred_non_thc_sum <- data %>%
  filter(uds_test) %>%
  group_by(race_group,pre_post_QI,delivery_month,non_thc_detect) %>%
  count() %>%
  ungroup() %>%
  group_by(race_group,pre_post_QI,delivery_month) %>%
  mutate(total = sum(n),
         perc_total = round(n/total, digits = 2)) %>%
  ungroup() 

pred_thc_sum <- data %>%
  filter(uds_test) %>%
  group_by(race_group,pre_post_QI,delivery_month,thc_detect) %>%
  count() %>%
  ungroup() %>%
  group_by(race_group,pre_post_QI,delivery_month) %>%
  mutate(total = sum(n),
         perc_total = round(n/total, digits = 2)) %>%
  ungroup() 

## equalized odds depends on order indication data
if (sum(is.na(data$ord_indict_non_thc))==nrow(data)) {
  equal_odds_wide <- NA
  error_ord <- "The data set is missing order indication data. The dashboard cannot create this visual or table."
} else {
  error_ord <- NA
  ## equalized odds
  equal_odds <- data %>%
    filter(uds_test) %>%
    group_by(race_group,age_group,pre_post_QI,ord_indict_non_thc,non_thc_detect) %>%
    count() %>%
    ungroup() %>%
    mutate(ord_ind_detect = factor(case_when(ord_indict_non_thc & non_thc_detect ~ 'TP',
                                      ord_indict_non_thc & !non_thc_detect ~ 'FP',
                                      !ord_indict_non_thc & !non_thc_detect ~ 'TN',
                                      !ord_indict_non_thc & non_thc_detect ~ 'FN')))
  
  equal_odds_wide <- pivot_wider(equal_odds%>%
    select(c("race_group","age_group","pre_post_QI",'ord_ind_detect', 'n')), names_from = ord_ind_detect, values_from = n) %>%
    mutate(FPR = (FP/(FP+TN)), 
           TPR = (TP/(TP+FN)), 
           ratio = FPR/TPR)
}

## equal outcome and group benefit depends on cps report data 
if (sum(is.na(data$cps_report))==nrow(data)) {
  equal_out <- NA
  group_benefit_wide <- NA
  error_cps <- "The data set is missing CPS report data. The dashboard cannot create this visual or table."
} else {
  error_cps <- NA
  ## equal outcomes
  equal_out <- data %>%
    # filter(!is.na(cps_reporting_date)) %>%
    group_by(uds_test, race_group, non_thc_detect,cps_report, pre_post_QI,delivery_month) %>%
    count() %>%
    ungroup() %>%
    group_by(uds_test, race_group,pre_post_QI,delivery_month) %>%
    mutate(total = sum(n),
           perc_total = round(n/total, digits = 2)) %>%
    ungroup()
  
  ## group benefit equality
  group_benefit <- data %>%
    filter(uds_test) %>%
    group_by(race_group,age_group,pre_post_QI,cps_report,non_thc_detect) %>%
    count() %>%
    ungroup() %>%
    mutate(cps_detect = factor(case_when(cps_report & non_thc_detect ~ 'TP',
                                         cps_report & !non_thc_detect ~ 'FP',
                                         !cps_report & !non_thc_detect ~ 'TN',
                                         !cps_report & non_thc_detect ~ 'FN')))
    # mutate(cps_detect = factor(case_when(cps_report & non_thc_detect ~ 'TP',
    #                                         !cps_report & non_thc_detect ~ 'FP',
    #                                         !cps_report & !non_thc_detect ~ 'TN',
    #                                         cps_report & !non_thc_detect ~ 'FN')))
  
  group_benefit_wide <- pivot_wider(group_benefit %>%
    select(c("race_group","age_group","pre_post_QI",'cps_detect', 'n')), names_from = cps_detect, values_from = n) %>%
    mutate(ratio = ((TP+FP)/(TP+FN)))
}
```

```{r, child = 'visuals.rmd'}
```

```{r, child = 'section.rmd'}
```